Stochastic matching pursuit for Bayesian variable selection

نویسندگان

  • Ray-Bing Chen
  • Chi-Hsiang Chu
  • Te-You Lai
  • Ying Nian Wu
چکیده

This article proposes a stochastic version of the matching pursuit algorithm for Bayesian variable selection in linear regression. In the Bayesian formulation, the prior distribution of each regression coefficient is assumed to be a mixture of a point mass at 0 and a normal distribution with zero mean and a large variance. The proposed stochastic matching pursuit algorithm is designed for sampling from the posterior distribution of the coefficients for the purpose of variable selection. The proposed algorithm can be considered a modification of the componentwise Gibbs sampler. In the componentwise Gibbs sampler, the variables are visited by a random or a systematic scan. In the stochastic matching pursuit algorithm, the variables that better align with the current residual vector are given higher probabilities of being visited. The proposed algorithm combines the efficiency of the matching pursuit algorithm and the Bayesian formulation with well defined prior distributions on coefficients. Several simulated examples of small n and large p are used to illustrate the algorithm. These examples show that the algorithm is efficient for screening and selecting variables.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Deep Learning: A Bayesian Perspective

Deep learning is a form of machine learning for nonlinear high dimensional pattern matching and prediction. By taking a Bayesian probabilistic perspective, we provide a number of insights into more efficient algorithms for optimisation and hyper-parameter tuning. Traditional high-dimensional data reduction techniques, such as principal component analysis (PCA), partial least squares (PLS), redu...

متن کامل

Ultra-high Dimensional Multiple Output Learning With Simultaneous Orthogonal Matching Pursuit: Screening Approach

We propose a novel application of the Simultaneous Orthogonal Matching Pursuit (SOMP) procedure to perform variable selection in ultra-high dimensional multiple output regression problems, which is the first attempt to utilize multiple outputs to perform fast removal of the irrelevant variables. As our main theoretical contribution, we show that the S-OMP can be used to reduce an ultra-high num...

متن کامل

Variable selection in high-dimension with random designs and orthogonal matching pursuit

The performance of orthogonal matching pursuit (OMP) for variable selection is analyzed for random designs. When contrasted with the deterministic case, since the performance is here measured after averaging over the distribution of the design matrix, one can have far less stringent sparsity constraints on the coefficient vector. We demonstrate that for exact sparse vectors, the performance of ...

متن کامل

Identification of DNA regulatory motifs using Bayesian variable selection

MOTIVATION Understanding the mechanisms that determine gene expression regulation is an important and challenging problem. A common approach consists of identifying DNA-binding sites from a collection of co-regulated genes and their nearby non-coding DNA sequences. Here, we consider a regression model that linearly relates gene expression levels to a sequence matching score of nucleotide patter...

متن کامل

Note for Alan’s Class

Wavelets have proven to be immensely useful for signal analysis and representation [7]. Various dictionaries of wavelets have been designed for different types of signals or function spaces [3, 13]. Two key factors underlying the successes of wavelets are the sparsity of the representation and the efficiency of the analysis. Specifically, a signal can typically be represented by a linear superp...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:
  • Statistics and Computing

دوره 21  شماره 

صفحات  -

تاریخ انتشار 2011